(54) An audio processing system

(57) An audio processing system (1) has real time recording
and logging computers (10, 15) which record sound and
annotation text in real time. A sound file (SF) and a log
file (LF) are associated with each take. The logging computer
(15) uses automatic entries to a tag file on a server (12) to
maintain synchronism between the log and sound files.
Timeline references are embedded as text within the log
files. A transcription workstation (16) retrieves the
corresponding sound and log files and the log file may be
exported to a word processing application with the embedded
timeline references to provide a template. The transcription
workstation (16) correlates time with digital data strings to
allow simple playback using foot pedals in a conventional
manner. It also allows selection of text using a graphical
tool with automatic searching to the associated sound file
segment.

Description

[0001] The invention relates to an audio processing system
for processing of audio proceedings of a setting such as a
parliament or a Court chamber.

[0002] United States Patent Specification No. 5878186
describes such a system. A computer aided transcription (CAT)
system operates in "virtual realtime" to generate a textual
record of the proceedings. A court reporter produces
"transition markers" at changes of speakers. Because of
pauses and court reporter time lags for transition markers,
the text is often out of synchronisation with the audio
signals. Further analysis is performed to achieve
synchronisation by identifying a different time period.

[0003] United States Patent Specification No. 4924387
(Jeppesen) also describes use of a stenographic device and a
controller which separates court reporter keystroke
combinations into phonetic and control keystrokes. United
States Patent Specification No. 5272571 (L. R. Linn and
Associates) also describes simultaneous recording of
keystrokes and audio data. It also describes later
transcription. During processing, a table is generated which
links stenotype keystrokes with fields, audio file pointers,
and corresponding text strings.

[0004] Such systems provide a good deal of textual
information in near real time. However, where comprehensive
editing is required the systems are quite inflexible. Also,
it appears that a large extent of manpower is required during
the audio proceedings.

[0005] The invention is therefore directed towards providing
a system which both captures audio proceedings in realtime in
a comprehensive manner, and which allows comprehensive and
flexible editing facilities.
[0006] According to the invention, there is provided an audio
processing system for generation of transcripts from audio
proceedings, the system comprising capture means for
recording audio and corresponding textual data and
transcription means for generating transcripts, characterised
in that,

the capture means comprises a logging means comprising means
for generating a textual log file for audio proceedings and
for embedding timeline references in the log file; and

the transcription means comprises means for reading the
embedded timeline references, for correlating the timeline
references to position data in an audio sound file, and for
automatically locating and playing back sound from the sound
file in response to selection of indicia indicating timeline
references in the log file.

[0007] In one embodiment, the capture means comprises a
recording means comprising means for recording the audio
proceedings in the sound file.

[0008] Preferably, the recording means comprises means for
simultaneously generating timeline data and making said data
available to the logging means.

[0009] In one embodiment, the recording means comprises means
for automatically generating the sound file on a server and
for updating the sound file during the audio proceedings, and
for writing said timeline data to a tag file on the server,
and the logging means comprises means for performing reads
from the tag file on the server during generation of the log
file.

[0010] Preferably, the recording means comprises means for
automatically writing a current sound file name and an
activity flag indicating if audio proceedings are about to
begin, have begun, or have stopped, and the logging means
comprises means for reading said flag and operating
accordingly.
[0011] In another embodiment, the recording means comprises
means for writing an overlap flag to the tag file to indicate
if the current sound is in an overlap period between sound
takes.

[0012] In a further embodiment, the logging means comprises
means for automatically embedding timeline references upon
input of a new speaker identity in the log file annotation
inputs.

[0013] Preferably, the logging means comprises means for
recording initial spoken words after identification data for
a new speaker.

[0014] In a further embodiment, the timeline references are
represented as a symbol at the start of a display text line,
the timeline reference being expanded upon selection of the
symbol.

[0015] In another embodiment, the transcription means
comprises means for exporting the log file to a word
processor application and for playing back the sound file at
a position selected by a user using the word processor
application for transcript editing.

[0016] Preferably, the transcription means comprises means
for automatically displaying a time counter for the current
sound file position to allow user input of a timeline
reference in a log file or in a transcript file.

[0017] The invention will be more clearly understood from the
following description of some embodiments thereof, given by
way of example only with reference to the accompanying
drawings in which:-

Fig. 1 is a schematic representation of an audio processing
system of the invention;

Fig. 2 is a representation of a tag file generated within the
system;

Fig. 3 is a representation of part of a log file generated
within the system; and

Fig. 4 is a representation of a part of a transcript
generated by the system.

[0018] Referring to the drawings, there is shown an audio
processing system 1 of the invention. The system 1 comprises
a real time section 2 and an off-line transcription section 3.

[0019] A local area network 4 interconnects the various
processing devices. These include a recording computer 10
which receives audio inputs from microphones 11. The
recording computer 10 accesses a server 12 via the network 4.
The server 12 has an audio storage area 13 and a transcript
storage area 14.

[0020] A logging computer 15 receives manual text annotation
inputs and it also communicates with the server 12. The
recording computer 10 and the logging computer 15 are the
primary real time devices within the system and both access
the server 12 and indeed access the same folders or
directories within the server 12 storage structures.

[0021] The off-line section 3 comprises a transcription
workstation 16, an editing workstation 17, a transcript
printer 18, and a modem 19. The printer 18 is for printing of
prepared transcripts of audio proceedings, and the modem 19
is for remote communication of transcripts. The network is in
this embodiment a local area network, however, the various
devices may be distributed more widely using wide area
network technology. Thus, the basic design of the system 1 is
very flexible.

[0022] The system 1 may be used for generation of transcripts
of audio proceedings in a parliament chamber or in a court,
for example. For illustrative purposes, it is presumed that
the audio proceedings are in a parliament chamber. These
proceedings are broken into "takes", each in this embodiment
10 minutes long. The recording computer 10 and the logging
computer 15 operate in real time to capture data and log it
into the server 12.

[0023] The recording computer 10 sets up a sound file (SF) on
the server 12 in a particular folder within the audio storage
area 13, the folder being associated with a particular
session of audio proceedings such as one parliamentary day. A
sound file is configured for storing digital audio data in a
conventional format such as a ".wav" format. The recording
computer 10 also has a clock which is reset at the start of
the take and this is used to generate entries for a tag file
which is also set up on the server 12. The tag file is
continuously updated and is not associated with any
particular take, but is associated with an audio proceedings
session.

[0024] In Fig. 2, a tag file is indicated by the numeral 30
and in this example it comprises the following data fields:-

A: The take name. These run in alphabetical sequence.

137: This is a timeline reference, namely a time stamp of
137 seconds into the take.

Flags: A first flag indicates with a "0" that there is
currently no overlap, and with a "1" that there is currently
overlap between takes. A second flag indicates with the word
"set" that a take is about to begin, with the word "run" that
a take is currently active, and with the word "stop" that a
take has ended.

The start time of the take, in this case 15:17:00.
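
For illustration only, the tag record described above can be
read as five whitespace-separated fields. The following Python
sketch assumes that plain-text layout, which matches the sample
record "A 137 0 run 15:17:00"; the names TagRecord and
parse_tag_record are hypothetical and do not appear in the
specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagRecord:
    take_name: str         # e.g. "A"; take names run in alphabetical sequence
    timeline_seconds: int  # seconds elapsed since the start of the take
    overlap: bool          # True when the first flag is "1"
    activity: str          # second flag: "set", "run" or "stop"
    start_time: str        # start time of the take, e.g. "15:17:00"


def parse_tag_record(line: str) -> TagRecord:
    """Parse one tag record (assumed layout: five space-separated fields)."""
    take, seconds, overlap, activity, start = line.split()
    if overlap not in ("0", "1"):
        raise ValueError(f"unexpected overlap flag: {overlap!r}")
    if activity not in ("set", "run", "stop"):
        raise ValueError(f"unexpected activity flag: {activity!r}")
    return TagRecord(take, int(seconds), overlap == "1", activity, start)


record = parse_tag_record("A 137 0 run 15:17:00")
print(record)
```

Keeping the record as an immutable value makes it easy to compare
successive reads of the tag file.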

[0025] The recording computer 10 operates a timer which
correlates one second segments to eight kB of audio data. The
recording computer 10 outputs a tag file update at six second
intervals and so typically the timeline reference increments
by six seconds with every update. Also, of course, the
recording computer 10 outputs the sound bytes. The frequency
for this output is one second segments, each with 8 kB.
[0026] At the same time, the logging computer 15 operates in
real time to generate a log file (LF) corresponding to the
particular sound file (SF) in a one-to-one relationship. The
log file is stored in the same folder or directory on the
server audio storage area 13. It is also identified by the
same name ("A") as the corresponding sound file. The logging
computer 15 receives text annotation inputted by a reporter.
A sample 35 is illustrated in Fig. 3. The logging computer 15
polls the tag file every two seconds until it detects a "run"
flag, upon which it automatically sets up a corresponding log
file and stores it in the audio storage area 13. The log file
is then updated with a new line giving a speaker identity and
the first few words. In this embodiment, an update is
performed for every "Carriage Return" keystroke.
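
The polling step can be sketched as follows in Python. This is
a minimal illustration, not the implementation of the
embodiment: the function wait_for_run, the injectable reader
and sleep callables, and the five-field record layout are all
assumptions made so the loop can be exercised without a real
server.

```python
import time
from typing import Callable


def wait_for_run(read_tag_line: Callable[[], str],
                 poll_seconds: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll the tag file until its activity flag reads "run".

    Returns the take name carried by the record that held the flag,
    which would then name the new log file.
    """
    while True:
        fields = read_tag_line().split()
        # Assumed layout: take, seconds, overlap flag, activity flag, start time.
        if len(fields) == 5 and fields[3] == "run":
            return fields[0]
        sleep(poll_seconds)


# Demonstration with a fake tag file and a fake sleep, so nothing blocks.
fake_lines = iter(["A 0 0 set 15:17:00",
                   "A 0 0 set 15:17:00",
                   "A 0 0 run 15:17:00"])
naps = []
take = wait_for_run(lambda: next(fake_lines), sleep=naps.append)
print(take, naps)
```

In a deployment the reader would open the tag file on the
server each time it is called; that part is deliberately left
out here.
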
[0027] The tag file 30 acts as a dynamic link between the
recording and logging computers 10 and 15 and this link is
maintained as long as the audio proceedings continue. Switch
over to a new take is flagged by the overlap flag "0/1" and
the overlap period in this embodiment is 15 seconds. Thus,
each sound file records 15 seconds of the next take which
provides sufficient time for roll-over. However, this time is
user-configurable to a desired setting. By simply monitoring
the tag file, the logging computer 15 generates the
appropriate logging files in real time and records the
relevant text annotations. Eventually, the "stop" flag will
be written to the tag file, upon which the logging computer
15 stops generating new log files.
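
One way to picture this monitoring is the Python sketch below.
It is an assumption-laden illustration: the specification does
not give the exact sequence of records written during
roll-over, so the sample records, the function name
takes_to_log, and the rule "open a log file on the first run
record for each take" are all hypothetical.

```python
from typing import Iterable, List


def takes_to_log(tag_lines: Iterable[str]) -> List[str]:
    """Return the take names for which a log file would be opened."""
    opened: List[str] = []
    for line in tag_lines:
        take, _seconds, _overlap, activity, _start = line.split()
        if activity == "stop":
            break                    # session over: no new log files
        if activity == "run" and take not in opened:
            opened.append(take)      # first "run" record seen for this take
    return opened


# Made-up session: take A rolls over to take B, then the session stops.
session = [
    "A 0 0 set 15:17:00",
    "A 0 0 run 15:17:00",
    "A 594 0 run 15:17:00",
    "A 600 1 run 15:17:00",   # overlap flag raised (assumed record)
    "B 0 1 run 15:27:00",
    "B 18 0 run 15:27:00",
    "B 600 0 stop 15:27:00",
]
print(takes_to_log(session))
```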

[0028] At the end of the session, there is a set of log and
sound files, there being a single log file and a single sound
file associated with each individual take. An important
aspect of generation of the log file is that the logging
computer 15 embeds a timeline reference in the log file.
Examples 36 are shown in Fig. 3. These three timeline
references are <0>, <16>, and <22>. This information is
available to the logging computer from the tag file and it is
automatically embedded as text together with the text
annotations which are inputted.
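
As a small illustration of embedding the reference as text, the
Python sketch below formats one log line from a timeline value,
a speaker identity and the first few words. The function name
log_line is hypothetical; the angle-bracket notation follows
the examples <0>, <16> and <22> quoted above.

```python
def log_line(timeline_seconds: int, speaker: str, first_words: str) -> str:
    """Format one log entry with the timeline reference embedded as text."""
    return f"<{timeline_seconds}> {speaker}: {first_words}"


# Rebuild the three sample entries of the log file.
entries = [
    log_line(0, "Mr. A", "The Minister will be aware"),
    log_line(16, "Minister", "The legislation Report Stage"),
    log_line(22, "Chairman", "The matter will"),
]
print("\n".join(entries))
```

Because the reference is ordinary text, it survives any later
copy of the log file into another application.
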
[0029] When it is desired to generate a transcript for the
audio proceedings session, the transcription workstation 16
and the editing workstation 17 are operated. The
transcription workstation 16 retrieves the sound and log
files for the session and opens corresponding sound and log
files simultaneously. The transcription workstation 16 has
foot pedals which are equivalent to those of conventional
audio tape dictating playback machines, namely Rewind, Play,
and Forward. An operator operates these pedals in a manner
equivalent to using a dictation and playback machine. Again,
the transcription computer 16 associates a one second time
period with 8 kB of sound data and it "rewinds" or "fast
forwards" through the relevant sound file by 8 kB for each
second of depression of the Rewind or Fast-Forward pedals.
Also, the transcription workstation 16 recognises keyboard or
mouse selections of lines of text such as one of the lines
illustrated in Fig. 3. The embedded timeline reference allows
it to locate the relevant position in the corresponding sound
file. For example, the third line illustrated in Fig. 3 is 22
seconds into the take and therefore the workstation starts
playback at byte number 176 kB in the sound file. In this
way, the operator can type in text as he or she listens to
playback of the sound file. The location within the log file
for typing the text is indicated by the name of the speaker
followed by the first few words which are spoken.
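
The arithmetic in this paragraph (22 seconds at 8 kB per second
gives 176 kB) can be sketched in Python as below. The function
names playback_offset_kb and pedal_seek_kb are hypothetical,
positions are kept in kB as in the text, and clamping at the
start of the file is an assumption rather than something the
description states.

```python
import re

KB_PER_SECOND = 8  # the description correlates one second with 8 kB of audio

_REFERENCE = re.compile(r"^<(\d+)>")


def playback_offset_kb(log_line: str) -> int:
    """Map the embedded timeline reference of a log line to a kB offset."""
    match = _REFERENCE.match(log_line)
    if match is None:
        raise ValueError("line carries no timeline reference")
    return int(match.group(1)) * KB_PER_SECOND


def pedal_seek_kb(position_kb: int, seconds_held: float, forward: bool) -> int:
    """Move 8 kB per second of pedal depression, never before the file start."""
    step = int(seconds_held * KB_PER_SECOND)
    moved = position_kb + step if forward else position_kb - step
    return max(0, moved)


start = playback_offset_kb("<22> Chairman: The matter will")
print(start)                                   # 176
print(pedal_seek_kb(start, 2, forward=False))  # 160
```
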
[0030] This workstation also operates to cut and paste the
log file into a third-party word processing application.
Indeed, cutting and pasting of log files into a word
processor application allows creation of an initial template
for generation of a transcript. An important aspect of the
embedded tags within the log file is that they are in a text
format which is ported with the other text into the word
processor template. A sample template 40 is shown in Fig. 4.
As the workstation 16 operates, it not only displays the
text, but also displays either a symbol indicating the
timeline reference or the reference itself. The symbol may be
a dot such as a dot 41 at the start of the line to indicate
that expansion of this dot displays the timeline reference.
During execution of the transcription programs, the
workstation 16 automatically displays the time counter for
the current text. This is indicated as 00.16 in the example
of Fig. 3. Thus, an operator who has both the transcription
word processor file and the log file open can manually input
a timeline reference in order to make these references more
comprehensive in the transcript. These references are
particularly important also for the editing workstation 17 as
additional operators may provide an input into a particular
transcript, depending on their particular transcript skills.
It must be borne in mind that parliamentary transcription is
a highly skilled task and the system 1 allows input from a
number of people in a simple and highly controlled and
structured manner.
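
The display behaviour can be illustrated with the Python sketch
below, which swaps a leading timeline reference for a symbol
and formats a time counter. It rests on assumptions: the symbol
is taken to be "*" as in the sample transcript, the counter is
read as minutes and seconds separated by a full stop (so 16
seconds shows as 00.16), and the names to_template_line and
time_counter are hypothetical.

```python
import re

_REFERENCE = re.compile(r"^<(\d+)>\s*")


def to_template_line(log_line: str, symbol: str = "*") -> str:
    """Show a symbol in place of the leading timeline reference."""
    # A function replacement avoids any escape handling of the symbol.
    return _REFERENCE.sub(lambda _m: symbol + " ", log_line, count=1)


def time_counter(seconds: int) -> str:
    """Format elapsed seconds as mm.ss (assumed counter format)."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}.{secs:02d}"


print(to_template_line("<16> Minister: The legislation Report Stage"))
print(time_counter(16))
```
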
[0031] It will be appreciated that the invention provides for
generation of a transcript with very accurate real time
recording of sound and annotation data and interlinking of
sound, logging, and transcript files in a manner whereby
comprehensive and accurate transcripts may be generated with
any required extent of editing and input from different
people. At the same time, the system also allows re-checking
back to the sound files in a highly organised and efficient
manner. Even the final transcript product is correlated back
to the audio recording which was captured live.

[0032] The invention is not limited to the embodiments
described but may be varied in construction and detail within
the scope of the claims.



Claims 

1. An audio processing system (1) for generation of
transcripts from audio proceedings, the system (1)
comprising capture means for recording audio and
corresponding textual data and transcription means for
generating transcripts, characterised in that,

the capture means comprises a logging means (15) comprising
means for generating a textual log file (LF) for audio
proceedings and for embedding timeline references in the log
file; and

the transcription means (16) comprises means for reading the
embedded timeline references, for correlating the timeline
references to position data in an audio sound file (SF), and
for automatically locating and playing back sound from the
sound file in response to selection of indicia indicating
timeline references in the log file.

2. A system as claimed in claim 1, wherein the capture means
comprises a recording means (10) comprising means for
recording the audio proceedings in the sound file (SF).

3. A system as claimed in claim 2, wherein the recording
means (10) comprises means for simultaneously generating
timeline data and making said data available to the logging
means.

4. A system as claimed in claim 3, wherein the recording
means (10) comprises means for automatically generating the
sound file on a server and for updating the sound file during
the audio proceedings, and for writing said timeline data to
a tag file on the server, and the logging means comprises
means for performing reads from the tag file on the server
during generation of the log file.

5. A system as claimed in claim 4, wherein the recording
means comprises means for automatically writing a current
sound file name and an activity flag indicating if audio
proceedings are about to begin, have begun, or have stopped,
and the logging means comprises means for reading said flag
and operating accordingly.

6. A system as claimed in claims 4 or 5, wherein the
recording means comprises means for writing an overlap flag
to the tag file to indicate if the current sound is in an
overlap period between sound takes.

7. A system as claimed in any preceding claim, wherein the
logging means comprises means for automatically embedding
timeline references upon input of a new speaker identity in
the log file annotation inputs.

8. A system as claimed in any preceding claim, wherein the
logging means comprises means for recording initial spoken
words after identification data for a new speaker.

9. A system as claimed in any preceding claim, wherein the
timeline references are represented as a symbol at the start
of a display text line, the timeline reference being expanded
upon selection of the symbol.



10. A system as claimed in any preceding claim, wherein the
transcription means (16) comprises means for exporting the
log file to a word processor application and for playing back
the sound file at a position selected by a user using the
word processor application for transcript editing.

11. A system as claimed in any preceding claim, wherein the
transcription means comprises means for automatically
displaying a time counter for the current sound file position
to allow user input of a timeline reference in a log file or
in a transcript file.

12. A system substantially as described with reference to the
accompanying drawings.



13. A computer program product directly loadable into the
internal memory of a digital computer, and comprising
software code for implementing the capture means and the
transcription means when said product is run on a digital
computer.

A 137 0 run 15:17:00

[take name] [time line reference] [Flags] [take start time]

<0> Mr. A: The Minister will be aware
<16> Minister: The legislation Report Stage
<22> Chairman: The matter will

* Mr. A: The minister will be aware that we have been making
representations in relation to this matter for some time.

* Minister: The legislation Report Stage is due next week. It deals with
all of the matters under review.

* Chairman: The matter will be discussed in the Chamber in full detail
when the Report Stage is reached.